197 research outputs found

    Robust detection of alternative splicing in a population of single cells

    Single cell RNA-seq experiments provide valuable insight into cellular heterogeneity but suffer from low coverage, 3′ bias and technical noise. These properties make the study of alternative splicing difficult, and thus most single cell studies have restricted analysis of transcriptome variation to the gene level. To address these limitations, we developed SingleSplice, which uses a statistical model to detect genes whose isoform usage shows biological variation significantly exceeding technical noise in a population of single cells. Importantly, SingleSplice is tailored to the unique demands of single cell analysis, detecting isoform usage differences without attempting to infer expression levels for full-length transcripts. Using data from spike-in transcripts, we found that our approach detects variation in isoform usage among single cells with high sensitivity and specificity. We also applied SingleSplice to data from mouse embryonic stem cells and discovered a set of genes that show significant biological variation in isoform usage across the set of cells. A subset of these isoform differences is linked to cell cycle stage, suggesting a novel connection between alternative splicing and the cell cycle.

    SLICER: inferring branched, nonlinear cellular trajectories from single cell RNA-seq data

    Accuracy of trajectory reconstruction using a subset of cells. (a) Graph showing how similar the SLICER trajectory is when computed using a random subset of lung cells. The blue bars show the similarity in cell ordering (units are percent sorted with respect to the trajectory constructed from all cells). The orange bars show the similarity in branch assignments (percentage of cells assigned to the same branch as the trajectory constructed from all cells). The values shown were obtained by averaging the results from five subsampled datasets for each percentage (80 %, 60 %, 40 %, and 20 %). (b) Order preservation and branch identity values computed as in panel (a), but for datasets sampled from the neural stem cell dataset. (PDF 106 kb)

    TUT7 catalyzes the uridylation of the 3′ end for rapid degradation of histone mRNA

    The replication-dependent histone mRNAs end in a stem–loop instead of the poly(A) tail present at the 3′ end of all other cellular mRNAs. Following processing, the 3′ end of histone mRNAs is trimmed to 3 nucleotides (nt) after the stem–loop, and this length is maintained by addition of nontemplated uridines if the mRNA is further trimmed by 3′hExo. These mRNAs are tightly cell-cycle regulated, and a critical regulatory step is rapid degradation of the histone mRNAs when DNA replication is inhibited. An initial step in histone mRNA degradation is digestion 2–4 nt into the stem by 3′hExo and uridylation of this intermediate. The mRNA is then degraded by the exosome, with stalled intermediates being uridylated. The enzyme(s) responsible for oligouridylation of histone mRNAs have not been definitively identified. Using high-throughput sequencing of histone mRNAs and degradation intermediates, we find that knockdown of TUT7 reduces both the uridylation at the 3′ end and the uridylation of the major degradation intermediate in the stem. In contrast, knockdown of TUT4 did not alter the uridylation pattern at the 3′ end and had a small effect on uridylation in the stem–loop during histone mRNA degradation. Knockdown of 3′hExo also altered the uridylation of histone mRNAs, suggesting that TUT7 and 3′hExo function together in trimming and uridylating histone mRNAs.

    The word landscape of the non-coding segments of the Arabidopsis thaliana genome

    Background: Genome sequences can be conceptualized as arrangements of motifs or words. The frequencies and positional distributions of these words within particular non-coding genomic segments provide important insights into how the words function in processes such as mRNA stability and regulation of gene expression. Results: Using an enumerative word discovery approach, we investigated the frequencies and positional distributions of all 65,536 different 8-letter words in the genome of Arabidopsis thaliana. Focusing on promoter regions, introns, and 3' and 5' untranslated regions (3'UTRs and 5'UTRs), we compared word frequencies in these segments to genome-wide frequencies. The statistically interesting words in each segment were clustered with similar words to generate motif logos. We investigated whether words were clustered at particular locations or were distributed randomly within each genomic segment, and we classified the words using gene expression information from public repositories. Finally, we investigated whether particular sets of words appeared together more frequently than others. Conclusion: Our studies provide a detailed view of the word composition of several segments of the non-coding portion of the Arabidopsis genome. Each segment contains a unique word-based signature. The respective signatures consist of the sets of enriched words, 'unwords', and word pairs within a segment, as well as the preferential locations and functional classifications for the signature words. Additionally, the positional distributions of enriched words within the segments highlight possible functional elements, and the co-associations of words in promoter regions likely represent the formation of higher order regulatory modules. This work is an important step toward fully cataloguing the functional elements of the Arabidopsis genome.
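
    The enumerative word counting described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the simple frequency ratio used as an enrichment score are all assumptions made for illustration. The one detail taken from the abstract is that a 4-letter DNA alphabet gives 4^8 = 65,536 possible 8-letter words.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # DNA alphabet; 4**8 = 65,536 possible 8-letter words


def enumerate_words(k):
    """Return every possible DNA word of length k (4**k words)."""
    return ["".join(letters) for letters in product(ALPHABET, repeat=k)]


def count_words(sequence, k):
    """Count overlapping k-letter words, skipping windows with non-ACGT letters."""
    sequence = sequence.upper()
    valid = set(ALPHABET)
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        word = sequence[start:start + k]
        if set(word) <= valid:
            counts[word] += 1
    return counts


def frequency_ratio(segment_counts, genome_counts, word):
    """Hypothetical enrichment score: segment frequency / genome-wide frequency.

    Returns None when the word never occurs genome-wide (ratio undefined).
    """
    segment_total = sum(segment_counts.values())
    genome_total = sum(genome_counts.values())
    if segment_total == 0 or genome_counts[word] == 0:
        return None
    segment_freq = segment_counts[word] / segment_total
    genome_freq = genome_counts[word] / genome_total
    return segment_freq / genome_freq


if __name__ == "__main__":
    # Toy sequences (assumptions), far shorter than any real genomic segment.
    genome = "ACGTACGTTTGACCAGTACGTACGT"
    segment = "ACGTACGT"
    k = 4
    genome_counts = count_words(genome, k)
    segment_counts = count_words(segment, k)
    print(len(enumerate_words(8)))
    print(frequency_ratio(segment_counts, genome_counts, "ACGT"))
```

    A real analysis would replace the raw ratio with a proper statistical test and would read sequences from annotated genome files. The sketch only shows the enumeration and counting step.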

    Selective single cell isolation for genomics using microraft arrays

    Genomic methods are used increasingly to interrogate the individual cells that compose specific tissues. However, current methods for single cell isolation struggle to phenotypically differentiate specific cells in a heterogeneous population and rely primarily on the use of fluorescent markers. Many cellular phenotypes of interest are too complex to be measured by this approach, making it difficult to connect genotype and phenotype at the level of individual cells. Here we demonstrate that microraft arrays, which are arrays containing thousands of individual cell culture sites, can be used to select single cells based on a variety of phenotypes, such as cell surface markers, cell proliferation and drug response. We then show that a common genomic procedure, RNA-seq, can be readily adapted to the single cells isolated from these rafts. We show that data generated using microrafts and our modified RNA-seq protocol compare favorably with data from the Fluidigm C1. We then used microraft arrays to select pancreatic cancer cells that proliferate in spite of cytotoxic drug treatment. Our single cell RNA-seq data identified several expected and novel gene expression changes associated with early drug resistance.

    Pseudogenes transcribed in breast invasive carcinoma show subtype-specific expression and ceRNA potential

    Background: Recent studies have shown that some pseudogenes are transcribed and contribute to cancer when dysregulated. In particular, pseudogene transcripts can function as competing endogenous RNAs (ceRNAs). The high similarity of gene and pseudogene nucleotide sequence has hindered experimental investigation of these mechanisms using RNA-seq. Furthermore, previous studies of pseudogenes in breast cancer have not integrated miRNA expression data in order to perform large-scale analysis of ceRNA potential. Thus, knowledge of both pseudogene ceRNA function and the role of pseudogene expression in cancer are restricted to isolated examples. Results: To investigate whether transcribed pseudogenes play a pervasive regulatory role in cancer, we developed a novel bioinformatic method for measuring pseudogene transcription from RNA-seq data. We applied this method to 819 breast cancer samples from The Cancer Genome Atlas (TCGA) project. We then clustered the samples using pseudogene expression levels and integrated sample-paired pseudogene, gene and miRNA expression data with miRNA target prediction to determine whether more pseudogenes have ceRNA potential than expected by chance. Conclusions: Our analysis identifies with high confidence a set of 440 pseudogenes that are transcribed in breast cancer tissue. Of this set, 309 pseudogenes exhibit significant differential expression among breast cancer subtypes. Hierarchical clustering using only pseudogene expression levels accurately separates tumor samples from normal samples and discriminates the Basal subtype from the Luminal and Her2 subtypes. Correlation analysis shows more positively correlated pseudogene-parent gene pairs and negatively correlated pseudogene-miRNA pairs than expected by chance. Furthermore, 177 transcribed pseudogenes possess binding sites for co-expressed miRNAs that are also predicted to target their parent genes. Taken together, these results increase the catalog of putative pseudogene ceRNAs and suggest that pseudogene transcription in breast cancer may play a larger role than previously appreciated. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1227-8) contains supplementary material, which is available to authorized users.

    Percutaneous mechanical circulatory support and survival in patients resuscitated from Out of Hospital cardiac arrest: A study from the CARES surveillance group

    INTRODUCTION: Maintenance of cardiac function is required for successful outcome after out-of-hospital cardiac arrest (OHCA). Cardiac function can be augmented using a mechanical circulatory support (MCS) device, most commonly an intra-aortic balloon pump (IABP) or Impella®. OBJECTIVE: Our objective is to assess whether the use of an MCS device is associated with improved survival in patients resuscitated from OHCA in Michigan. METHODS: We matched cardiac arrest cases during 2014-2017 from the Cardiac Arrest Registry to Enhance Survival (CARES) in Michigan and the Michigan Inpatient Database (MIDB) using probabilistic linkage. Multilevel logistic regression tested the association between MCS and the primary outcome of survival to hospital discharge. RESULTS: A total of 3790 CARES cases were matched with the MIDB and 1131 (29.8%) survived to hospital discharge. A small number were treated with MCS, either an IABP (n = 183) or Impella® (n = 50). IABP use was associated with an improved outcome (unadjusted OR = 2.16, 95% CI [1.59, 2.93]), while use of Impella® approached significance (OR = 1.72, 95% CI [0.96, 3.06]). Use of MCS was associated with improved outcome (unadjusted OR = 2.07, 95% CI [1.55, 2.77]). In a multivariable model, MCS use was no longer independently associated with improved outcome (OR(adj) = 0.95, 95% CI [0.69, 1.31]). In the subset of subjects with cardiogenic shock (N = 725), MCS was associated with improved survival in univariate (unadjusted OR = 1.84, 95% CI [1.24, 2.73]) but not multivariable modeling (OR(adj) = 1.14, 95% CI [0.74, 1.77]). CONCLUSION: Use of MCS was infrequent in patients resuscitated from OHCA and was not independently associated with improvement in post-arrest survival after adjusting for covariates.